I.  INTRODUCTION


Multiplication of the complex numbers x and y, where x = a+jb and
y = c+jd, requires the computation of ac-bd and ad+bc. Computed
directly, this requires four real multiplications and two real
additions. It is well known that the identity

xy = (ac-bd) + j(ad+bc)
   = (a(c-d) + (a-b)d) + j(b(c+d) + (a-b)d)    (1)

frequently attributed to Golub, can be used instead, requiring three
real multiplications and five real additions. This identity can be
obtained as a special case of an efficient algorithm for polynomial
multiplication, as discussed elegantly by Winograd in [1], page 18.
Let a real multiplication be computationally equivalent to r real
additions. Clearly, application of (1) is of interest only if r>3, as
indicated by Moharir in [2], since (1) trades one multiplication for
three additions. With the advent of distributed computing, and the
increased computational power available on individual VLSI chips, the
value of r approaches unity in some cases. This is the case, for
example, in applications where the predominant factor in the
computational cost is that of the I/O requirements and data
manipulation.
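
As a minimal sketch of identity (1) for scalars, the following Python
fragment compares the direct product with the three-multiplication
form. The function names mul_direct and mul_three are illustrative and
do not appear in the original text.

```python
def mul_direct(a, b, c, d):
    """(a + jb)(c + jd) with four real multiplications and two additions."""
    return a * c - b * d, a * d + b * c


def mul_three(a, b, c, d):
    """Same product via identity (1): three multiplications, five additions."""
    t = (a - b) * d          # shared term, used in both parts
    real = a * (c - d) + t   # a(c-d) + (a-b)d = ac - bd
    imag = b * (c + d) + t   # b(c+d) + (a-b)d = ad + bc
    return real, imag


print(mul_direct(3, 2, 5, 7))
print(mul_three(3, 2, 5, 7))
```

Both calls should agree for any real inputs; with floating-point data
the two forms may differ in the last few bits because the operations
are ordered differently.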

An important field in which multiplication is inherently more costly
than addition is that of matrix arithmetic. For n x n real matrices, a
multiplication requires O(n^3) operations, while only O(n^2) are
needed for addition. Fortunately, commutativity is not required for
(1) to hold, and (1) is therefore applicable to complex matrices with
compatible dimensions. In Section II, the case of square complex
matrices is considered, where application of (1) is shown to result in
saving up to 1/4 of the computations, even if r = 1.
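
Because (1) never moves a factor taken from x past a factor taken from
y, the same steps carry over to matrices. The sketch below assumes real
matrices stored as nested Python lists and uses hypothetical helper
names (matmul, add, sub); it is an illustration, not an optimized
implementation.

```python
def matmul(x, y):
    """Real matrix product of nested lists; x is p x n, y is n x m."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def add(x, y):
    """Elementwise sum of two equally sized matrices."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def sub(x, y):
    """Elementwise difference of two equally sized matrices."""
    return [[u - v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def complex_matmul_direct(a, b, c, d):
    """(a + jb)(c + jd) with four real matrix products."""
    return sub(matmul(a, c), matmul(b, d)), add(matmul(a, d), matmul(b, c))


def complex_matmul_three(a, b, c, d):
    """(a + jb)(c + jd) via identity (1): three real matrix products."""
    t = matmul(sub(a, b), d)              # (a-b)d, shared by both parts
    real = add(matmul(a, sub(c, d)), t)   # a(c-d) + (a-b)d = ac - bd
    imag = add(matmul(b, add(c, d)), t)   # b(c+d) + (a-b)d = ad + bc
    return real, imag


# Small non-commuting example: factors from x always stay on the left.
a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = [[2, 0], [1, 3]]
d = [[1, 1], [0, 2]]
print(complex_matmul_three(a, b, c, d) == complex_matmul_direct(a, b, c, d))
```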

Three of the five additions in (1) depend on either x or y, but not
both. The quantity (a-b) depends only on x, while (c+d) and (c-d)
require only y. Such computations have the desirable feature that
they do not require data communication to combine x and y. In
addition, if either x or y is fixed, such quantities can be
precomputed once. There is an asymmetry in the above quantities,
since only one of them depends on x, while the other two depend
on y. This asymmetry suggests the existence of a dual form, where
the roles of x and y are interchanged, but without requiring
commutativity. This form is

xy = ((a-b)c + b(c-d)) + j((a+b)d + b(c-d))    (2)

which is of importance for rectangular matrices and applications
with fixed data, as discussed in Section III. In the conclusion,
Section IV, comments on some applications and on the possibility
of combining this work with other matrix multiplication
algorithms are presented.

II.  SQUARE  MATRICES 


In this section, the x and y of (1) and (2) represent n x n
complex matrices. Direct multiplication of x and y requires A
real additions and M real multiplications, where

A = 2n^2 + 4n^2(n-1) = 4n^3 - 2n^2,
M = 4n^3    (3)

On the other hand, using (1) or (2) requires

A = 5n^2 + 3n^2(n-1) = 3n^3 + 2n^2,
M = 3n^3    (4)
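
The counts in (3) and (4) follow from the cost of one real n x n
product, namely n^3 multiplications and n^2(n-1) additions, plus n^2
additions for each matrix sum. A short check under that assumption
(the function names are illustrative):

```python
def real_product_ops(n):
    """(additions, multiplications) for one real n x n matrix product."""
    return n * n * (n - 1), n ** 3


def direct_ops(n):
    """Four real products plus two n x n matrix sums, as in (3)."""
    adds, mults = real_product_ops(n)
    return 4 * adds + 2 * n * n, 4 * mults


def three_product_ops(n):
    """Three real products plus five n x n matrix sums, as in (4)."""
    adds, mults = real_product_ops(n)
    return 3 * adds + 5 * n * n, 3 * mults


for n in (1, 2, 3, 8):
    print(n, direct_ops(n), three_product_ops(n))
```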


The cost of computing xy in equivalent additions is

C = 4n^3(1+r) - 2n^2    (5)

for direct computation, and

C' = 3n^3(1+r) + 2n^2    (6)

for computation using (1) or (2). To compare the two approaches
we use either

S = (C-C')/C = (n(1+r) - 4)/(4n(1+r) - 2)    (7)

which represents the relative reduction in computational cost
when (1) or (2) is used for complex square matrix multiplication,
or

R = 1-S = C'/C = (3n(1+r) + 2)/(4n(1+r) - 2)    (8)

the ratio of their costs. It is clear that as n(1+r) increases, S
approaches 1/4, while R approaches 3/4. The following table
should be of value in assessing the range of values of r and n
for which the approach is of interest. The values of R in the
table are rounded to two decimal places. Even if multiplication is
considered computationally equivalent to addition, i.e. for r = 1,
appreciable savings are possible for modest values of n. For n = 2,
(8) gives R = 1, so direct computation is equivalent to the proposed
approach; for n = 3, R = .91, and R decreases further with
increasing n to approach 3/4.
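
Values of R can be regenerated directly from (8). The sketch below
tabulates R for a few values of r and n; the particular grid is an
assumption chosen for illustration and is not taken from the original
table.

```python
def ratio_square(n, r):
    """R = C'/C from (8) for n x n complex matrices."""
    return (3 * n * (1 + r) + 2) / (4 * n * (1 + r) - 2)


n_values = (2, 3, 4, 8, 16, 64)   # illustrative grid, not from the source
for r in (1, 2, 4):
    row = "  ".join(f"{ratio_square(n, r):.2f}" for n in n_values)
    print(f"r = {r}:  {row}")
```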

III.  THE  GENERAL  CASE


For the general case of complex matrices x and y of dimensions
p x n and n x m, respectively, direct computation requires

A = 2pm + 4pm(n-1) = 4pmn - 2pm,
M = 4pmn    (9)

For non-square matrices, (1) and (2) yield different results. Form
(2) computes a-b and a+b from x (2pn additions) and c-d from y (mn
additions), so if (2) is used we get

A = 2pn + mn + 2pm + 3pm(n-1) = 3pmn + 2pn + mn - pm,
M = 3pmn    (10)

while (1), which computes a-b from x and both c-d and c+d from y,
results in

A = pn + 2mn + 2pm + 3pm(n-1) = 3pmn + pn + 2mn - pm    (11)

and the same value of M = 3pmn. One pn in the expression of A in
(10) is replaced by an mn in (11). Therefore, (2) should be used
for matrix pairs with p<m and (1) for those with m<p. Direct
computation requires

C = 4pmn(1+r) - 2pm    (12)

equivalent additions, while the proper choice of (1) or (2)
results in

C' = 3pmn(1+r) - pm + mn + pn + n min{p,m}    (13)

equivalent additions. From (12) and (13) we obtain the ratio

R = C'/C
  = (3pmn(1+r) - pm + mn + pn + n min{p,m}) / (4pmn(1+r) - 2pm)    (14)

which approaches 3/4 as pmn(1+r) increases.
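
The choice between the two forms can be checked by counting additions
directly. In the sketch below, x_sums and y_sums are hypothetical
parameters giving the number of preliminary matrix additions performed
on x (each costing pn) and on y (each costing mn); the two forms
correspond to the splits (1, 2) and (2, 1).

```python
def form_additions(p, n, m, x_sums, y_sums):
    """Real additions for a three-product form on (p x n)(n x m) matrices."""
    preliminary = x_sums * p * n + y_sums * n * m   # sums formed on x and on y
    inside_products = 3 * p * m * (n - 1)           # three real matrix products
    combining = 2 * p * m                           # two p x m sums at the end
    return preliminary + inside_products + combining


def cost_three(p, n, m, r):
    """C' of (13): the cheaper of the two forms, in equivalent additions."""
    adds = min(form_additions(p, n, m, 1, 2), form_additions(p, n, m, 2, 1))
    return adds + r * 3 * p * m * n


def cost_direct(p, n, m, r):
    """C of (12): four real products and two p x m sums."""
    return 4 * p * m * n * (1 + r) - 2 * p * m


def ratio_general(p, n, m, r):
    """R = C'/C of (14)."""
    return cost_three(p, n, m, r) / cost_direct(p, n, m, r)


print(ratio_general(4, 64, 32, 1))
```

Taking the minimum over the two splits reproduces the n min{p,m} term
of (13) without hard-coding which form is preferred.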

In some cases, one of the two matrices either remains constant or
changes infrequently, while the second matrix changes frequently.
Let x be fixed, and y frequently changing. For direct computation,
C is the same as in (12), since every multiplication or addition
involves at least one element of y. Computation based on (2)
requires the calculation of a+b and a-b, which involve only the
elements of x, can be precomputed, and are therefore not included
in the computational cost assessed next. All other computations
involve y and require

A = mn + 2pm + 3pm(n-1),
M = 3pmn    (15)

resulting in

C' = 3pmn(1+r) + mn - pm    (16)

and

R = (3n(1+r) + n/p - 1)/(4n(1+r) - 2)    (17)

which does not depend on m. Even for r = 1, R in (17) never exceeds
1, with R = 1 attained only for n = p = 1, in which case a complex
multiplication costs three real additions and three real
multiplications.
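
For the scalar case n = p = 1 with x fixed, form (2) can be sketched
as a closure that precomputes a-b and a+b once. The name
make_multiplier is hypothetical.

```python
def make_multiplier(a, b):
    """Return a function computing (a + jb)(c + jd) for fixed a + jb via (2)."""
    a_minus_b = a - b   # depends only on x, computed once
    a_plus_b = a + b    # depends only on x, computed once

    def multiply(c, d):
        t = b * (c - d)              # one addition, one multiplication
        real = a_minus_b * c + t     # (a-b)c + b(c-d) = ac - bd
        imag = a_plus_b * d + t      # (a+b)d + b(c-d) = ad + bc
        return real, imag

    return multiply


times_x = make_multiplier(3, 2)
print(times_x(5, 7))
print(times_x(1, 0))
```

Each call to the returned function uses three multiplications and
three additions, matching the count stated for n = p = 1.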

The case of an inner product is of particular interest. For
p = m = 1, r = 1, and large n we get R = 7/8 from (17). This is
in comparison to the case where neither x nor y is fixed, for
which (14) gives R approaching 9/8, so that (1) and (2) offer no
saving over direct computation.
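
The values quoted for (17) can be checked numerically. A minimal
sketch, with ratio_fixed_x as an illustrative name:

```python
def ratio_fixed_x(p, n, r):
    """R of (17): x fixed, so a-b and a+b are precomputed and not counted."""
    return (3 * n * (1 + r) + n / p - 1) / (4 * n * (1 + r) - 2)


print(ratio_fixed_x(1, 1, 1))        # scalar case, r = 1
print(ratio_fixed_x(1, 10**6, 1))    # long inner product, r = 1
```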

IV.  CONCLUSION


Extension of (1) and (2) to complex matrices, resulting in
computational savings of up to 1/4, could be of interest in a
variety of applications. In digital signal processing, inner
products, vector-scalar and vector-matrix multiplications are
sometimes encountered with complex entries. This is the case, for
example, in polyphase filters, filter banks and some signal
transforms. This is also the case in radar and communication
applications, with digital processing in the base-band.

Efficient algorithms for real matrix multiplication could be
advantageously combined with this work. For example, the
coefficient of the O(n^{log_2 7}) algorithm of Strassen in [3] would
be reduced by up to 1/4.